Uncertainty decoding on Frequency Filtered parameters for robust ASR
Abstract
The use of feature enhancement techniques to obtain estimates of the clean parameters is a common approach for robust automatic speech recognition (ASR). However, the decoding algorithm typically ignores how accurate these estimates are. Uncertainty decoding methods incorporate this information. In this paper, we develop a formulation of the uncertainty decoding paradigm for Frequency Filtered (FF) parameters using spectral subtraction as the feature enhancement method. Additionally, we show that the uncertainty decoding method for FF parameters admits a simple interpretation as a spectral weighting method that assigns more importance to the most reliable spectral components. Furthermore, we suggest combining this method with SSBD-HMM (Spectral Subtraction and Bounded Distance HMM), a recently proposed technique that compensates for the effects of highly contaminated features (outliers). This combination pursues two objectives: to improve the results achieved by uncertainty decoding methods, and to determine which part of the improvement is due to compensating for the effects of outliers and which part is due to compensating for other, less deteriorated features.
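The general idea of uncertainty decoding can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussian state densities and a per-component variance for the enhancement error, in which case the model variance is simply inflated by that error variance before scoring. This is the generic textbook form, not the paper's FF-specific formulation, and every name and number below (`log_gauss_uncertain`, `unc_var`, the toy vectors) is hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gauss_uncertain(x_hat, unc_var, mean, var):
    """Score an enhanced feature vector x_hat with inflated variances.

    unc_var holds the assumed variance of the enhancement error for each
    component; it is an input here, not something this sketch estimates.
    """
    inflated = [v + u for v, u in zip(var, unc_var)]
    return log_gauss_diag(x_hat, mean, inflated)

# Hypothetical 3-component example: the second component is unreliable.
mean = [0.0, 0.0, 0.0]
var = [1.0, 1.0, 1.0]
x_hat = [0.1, 3.0, -0.2]
unc_var = [0.0, 8.0, 0.0]

plain = log_gauss_diag(x_hat, mean, var)
robust = log_gauss_uncertain(x_hat, unc_var, mean, var)
```

In this toy case the unreliable component is penalized less under the inflated variance, so `robust` exceeds `plain`. This matches the weighting interpretation in the abstract: components with low uncertainty dominate the score.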
Similar resources
Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling
Although deep neural network (DNN) based acoustic models have obtained remarkable results, the automatic speech recognition (ASR) performance still remains low in noise and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...
Integration of DNN based speech enhancement and ASR
Speech enhancement employing Deep Neural Networks (DNNs) is gaining strength as a data-driven alternative to classical Minimum Mean Square Error (MMSE) enhancement approaches. In the past, Observation Uncertainty approaches to integrate MMSE speech enhancement with Automatic Speech Recognition (ASR) have yielded good results as a lightweight alternative for robust ASR. In this paper we thus exp...
Individual on-line variance adaptation of frequency filtered parameters for robust ASR
In this paper we address the problem of robust speech recognition. We propose a new method based on the individual variance adaptation of frequency filtered parameters to reduce the deleterious effects of additive narrow-band noise. The method can be interpreted as a spectral weighting that assigns increased importance to the most reliable spectral components, typically the spectral peaks. The ...
Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR
In this paper we show how the robustness of multi-stream multi-layer perceptron (MLP) acoustic models can be increased through uncertainty propagation and decoding. We demonstrate that MLP uncertainty decoding yields consistent improvements over using minimum mean square error (MMSE) feature enhancement in MFCC and RASTA-LPCC domains. We introduce as well formulas for the computation of the unc...
Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features
Speech enhancement is an important front-end technique to improve automatic speech recognition (ASR) in noisy environments. However, the wrong noise suppression of speech enhancement often causes additional distortions in speech signals, which degrades the ASR performance. To compensate the distortions, ASR needs to consider the uncertainty of enhanced features, which can be achieved by using t...
Journal title:
- Speech Communication
Volume 52
Publication date: 2010